The AI Ladder by Rob Thomas

Author: Rob Thomas
Language: eng
Format: epub
Publisher: O'Reilly Media
Published: 2020-04-29T00:00:00+00:00


The Challenges of Collecting New Sources of High-Volume Unstructured Data

Managing data that arrives in real time requires new kinds of tools, such as Apache Kafka, which facilitates streaming data into databases. Unstructured data also frequently requires new kinds of databases. As we saw in the previous chapter, RDBMSs can only store data effectively if that data’s structure is specified in a schema. If your incoming data is unstructured (or its shape is unknown), there’s no schema to work with. This is where NoSQL databases come in. The term NoSQL is starting to fall out of favor, but it’s important to realize that the technologies it refers to are not. They provide other ways to manage data (document databases, graph databases, column stores, key-value stores), and those tools will more than likely become part of your information agenda. Many of them are optimized for high-speed write access, which is exactly what you need for real-time data.
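To make the schema problem concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a recommendation from the book: the event records, the table names, and the helper functions are all hypothetical, and SQLite with a JSON text column merely stands in for a real document database. The point is only that a fixed schema rejects a record whose shape it did not anticipate, while a document-style store accepts it as is.

```python
import json
import sqlite3

# Hypothetical incoming events whose shape varies from record to record.
events = [
    {"device": "sensor-1", "temp_c": 21.5},
    {"device": "cam-7", "frame": 1042, "labels": ["person", "dog"]},
]

conn = sqlite3.connect(":memory:")

# Relational style: the schema fixes the columns (and their constraints) up front.
conn.execute("CREATE TABLE readings (device TEXT NOT NULL, temp_c REAL NOT NULL)")

# Document style (stand-in): each record is stored whole, as a JSON string.
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")


def insert_relational(event):
    """Return True if the event fits the fixed schema, False otherwise."""
    try:
        conn.execute(
            "INSERT INTO readings (device, temp_c) VALUES (?, ?)",
            (event.get("device"), event.get("temp_c")),
        )
        return True
    except sqlite3.IntegrityError:
        # A missing field becomes NULL and violates the NOT NULL constraint.
        return False


def insert_document(event):
    """Store the event unchanged, whatever fields it happens to have."""
    conn.execute("INSERT INTO documents (body) VALUES (?)", (json.dumps(event),))


relational_ok = [insert_relational(e) for e in events]
for e in events:
    insert_document(e)

stored = [
    json.loads(row[0])
    for row in conn.execute("SELECT body FROM documents ORDER BY id")
]
```

In this sketch the second event cannot be stored in `readings` because it has no temperature, yet both events land in `documents` intact. A production document database adds indexing and querying over those fields, which this stand-in does not attempt.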

Getting access to all kinds of data means being “polyglot,” not in terms of programming languages, but in terms of data storage and models. One of the toughest tasks on the first rung of the AI Ladder is designing uniform ways for everyone across the company to access the data. Several things help. SQL has become a common language across many kinds of databases, including many NoSQL databases. You can build web services that allow online access to databases from any point in the organization, using technology that’s already available at everyone’s desk (or, for that matter, in a lunchroom, conference center, or airliner). And cloud providers can create robust data collections that can be accessed worldwide and are resistant (though not immune) to outages.
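One way to picture uniform access over polyglot storage is a thin layer that exposes the same call regardless of which kind of store sits behind it. The sketch below is a hypothetical design, not something the book prescribes: the class names (`SqlCustomers`, `KeyValueSessions`, `DataCatalog`), the sample records, and the `fetch` interface are all assumptions made for illustration, with SQLite standing in for a relational database and a plain dictionary standing in for a key-value store.

```python
import sqlite3
from typing import Any, Dict, Optional, Protocol


class DataSource(Protocol):
    """Hypothetical common interface that every backend agrees to implement."""

    def fetch(self, key: str) -> Optional[Dict[str, Any]]: ...


class SqlCustomers:
    """Relational backend: rows in a table with a fixed schema."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT)")
        self.conn.execute("INSERT INTO customers VALUES ('c1', 'Ada')")

    def fetch(self, key: str) -> Optional[Dict[str, Any]]:
        row = self.conn.execute(
            "SELECT id, name FROM customers WHERE id = ?", (key,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


class KeyValueSessions:
    """Key-value backend: free-form records looked up by key."""

    def __init__(self) -> None:
        self.store: Dict[str, Dict[str, Any]] = {
            "s9": {"id": "s9", "customer": "c1", "pages": ["/home", "/pricing"]}
        }

    def fetch(self, key: str) -> Optional[Dict[str, Any]]:
        return self.store.get(key)


class DataCatalog:
    """Single entry point: callers name a source and a key, nothing more."""

    def __init__(self, sources: Dict[str, DataSource]) -> None:
        self.sources = sources

    def get(self, source_name: str, key: str) -> Optional[Dict[str, Any]]:
        return self.sources[source_name].fetch(key)


catalog = DataCatalog({"customers": SqlCustomers(), "sessions": KeyValueSessions()})
customer = catalog.get("customers", "c1")
session = catalog.get("sessions", "s9")
missing = catalog.get("customers", "unknown")
```

Callers here never learn which storage model answered their request. A web service, as the text suggests, could expose the same `get` call over HTTP so that it is reachable from anywhere in the organization.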

The technical aspects of data access are solvable. What about the organizational aspects?


